A New Training Algorithm for Kanerva's Sparse Distributed Memory 



Lou Marvin Caraig 
Department of Systems and Computer Science 
University of Florence 
Email: loumarvincaraig.unifi@gmail.com 



Abstract: The Sparse Distributed Memory (SDM) proposed by
Pentti Kanerva was conceived as a model of human long-term
memory. The architecture of the SDM makes it possible to
store binary patterns and to retrieve them using partially
matching patterns. However, Kanerva's model is efficient
only when handling random data. The purpose of this article
is to introduce a new approach to training Kanerva's SDM
that handles non-random data efficiently and gives the
memory the capability to recognize inverted patterns. This
approach uses a signal model that differs from the one
proposed, for different purposes, by Hely, Willshaw and
Hayes in [4]. This article additionally suggests a way of
creating hard locations in the memory that differs from
Kanerva's static model.

Keywords: SDM, Kanerva, memory, pattern recognition,
signal.

I Introduction 

In 1984 Pentti Kanerva introduced the Sparse Distributed
Memory in an attempt to model human memory, in particular
long-term memory. The idea was that different concepts in
our minds are like points in a high-dimensional space,
where the distance between two points grows as the
concepts become more different. The SDM consists of a
reasonably large number of memory locations (hard
locations) randomly distributed throughout the address
space {0, 1}^n, where n is the length of the patterns.
Furthermore, at every location there is a vector of
counters, initialized to 0, with the same length as the
patterns (there is a counter for every bit of the address).
In Kanerva's paper a pattern can be stored at a location
different from the binary string representing the pattern
itself; in this article, however, a pattern to be stored
always addresses itself: patterns are self-addressing.

The storage of a pattern consists of updating the vector
of counters of every location whose distance from the
pattern is lower than a selected value called the radius.
The distance between two binary strings is the Hamming
distance, that is, the number of bit positions in which
they differ. When a location is within the hypersphere
centered on the input pattern, its counters are updated as
follows: a 1 in the input pattern increases by 1 the value
of the counter at the corresponding position in the vector,
while a 0 decreases the value by 1.

The retrieval consists of summing all the vectors of
counters whose addresses are within the hypersphere
centered on the retrieval pattern, and of applying a
threshold function to every element of the sum vector,
using 0 as the threshold. The retrieval can also be
iterated using the previously recalled pattern.
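
As a minimal sketch of the storage and retrieval just
described, the following Python code implements a toy
self-addressing SDM. The class and method names, the
inclusive comparison with the radius and the handling of a
sum of exactly 0 are assumptions made for illustration, not
details taken from Kanerva's work.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


class KanervaSDM:
    """Toy SDM with static, randomly placed hard locations (illustrative)."""

    def __init__(self, n_bits, n_locations, radius, seed=0):
        rng = random.Random(seed)
        self.radius = radius
        # Hard-location addresses drawn uniformly from {0, 1}^n.
        self.addresses = [[rng.randint(0, 1) for _ in range(n_bits)]
                          for _ in range(n_locations)]
        # One integer counter per address bit, all starting at 0.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, pattern):
        """Indices of locations inside the hypersphere around pattern.

        Assumption: the boundary (distance == radius) counts as inside.
        """
        return [i for i, addr in enumerate(self.addresses)
                if hamming(addr, pattern) <= self.radius]

    def store(self, pattern):
        """Self-addressing write: +1 for each 1 bit, -1 for each 0 bit."""
        for i in self._active(pattern):
            for j, bit in enumerate(pattern):
                self.counters[i][j] += 1 if bit else -1

    def retrieve(self, cue):
        """Sum the counters of the active locations and threshold at 0."""
        active = self._active(cue)
        sums = [sum(self.counters[i][j] for i in active)
                for j in range(len(cue))]
        # Assumption: a sum of exactly 0 is read as bit 0.
        return [1 if s > 0 else 0 for s in sums]
```

Iterated retrieval, as mentioned above, would simply pass
the output of retrieve back in as the next cue.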



Figure 1: An SDM with 6-bit patterns and radius = 1. The
top part shows the storage algorithm and the bottom part
shows the retrieval algorithm, as explained in the
introduction.

The new approaches explained in this article try to
overcome the limit of Kanerva's original SDM, which handles
only random data efficiently. This problem is important
because random data are rare in the real world. These new
approaches, which are expressed mathematically as different
training algorithms, are inspired by the human capability
to recognize inverted patterns and by the use of both
short-term and long-term memory in recognizing a pattern
(an odour, a voice, etc.) depending on the information a
person has.

Seeing a black logo on a white background is sufficient
for our brain to recognize (most of the time) the same
logo with inverted colors. Why, then, should an SDM not be
able to do the same? As in [4], the training algorithm
presented below uses real values for the counters, and not
integers as in Kanerva's model.

II Maximizing the choice of the hard 
locations from the address space 

To use the memory at its best, the new approach does not
use Kanerva's original static model of creating the hard
locations; instead, the construction of the SDM is dynamic
and the locations are created at each storage of a
pattern. In detail, every pattern, before being stored,
generates some memory locations in order to maximize
their usefulness. The memory locations to create in the
SDM are generated by corrupting the input pattern with
some given percentage of noise. The percentages of
corruption depend on the training algorithm, and the
number of generated addresses depends on the size of the
pattern. In the tests performed by the author, the
addresses are generated as follows, where n indicates the
number of hard locations and p the percentage of
corruption:



These values of corruption are such that the vectors of
counters of all the addresses are updated, hence there are
no useless locations. These values can be suitable for
Kanerva's SDM, but in the Signal Decay model described
below the addresses far from the one identified by the
input pattern are also needed, so the number of vectors
and the percentages of corruption are modified as
follows:

This dynamic way of creating hard locations in the SDM
guarantees that every location is used. These values are
only indicative and could certainly be chosen better.
Nevertheless, the tests showed that they are already
satisfactory when the performance of Kanerva's original
SDM is compared with that of the new model.
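
The generation step of this section can be sketched in
Python as follows. The function names and the form of the
plan argument are hypothetical, and the counts and
percentages in the usage note are placeholders, not the
values used in the author's tests.

```python
import random


def corrupt(pattern, fraction, rng):
    """Return a copy of pattern with round(fraction * n) distinct bits flipped."""
    n_flips = round(fraction * len(pattern))
    flipped = set(rng.sample(range(len(pattern)), n_flips))
    return [bit ^ 1 if i in flipped else bit
            for i, bit in enumerate(pattern)]


def generate_hard_locations(pattern, plan, rng):
    """Create hard-location addresses around pattern before storing it.

    plan is a list of (count, fraction) pairs: `count` addresses are
    produced by corrupting `fraction` of the bits of the input pattern.
    """
    return [corrupt(pattern, fraction, rng)
            for count, fraction in plan
            for _ in range(count)]
```

For instance, generate_hard_locations(pattern, [(3, 0.1),
(2, 0.5)], random.Random(1)) would return three addresses
close to the pattern and two at an intermediate distance.
A plan for the Signal Decay model would also include
fractions close to 1, so that locations near the
complemented pattern exist.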

III Signal Decay model

The differences between the signal model presented in this
article and the signal model presented in [4] are:

• the signal in [4] loses a percentage of its strength at
every location reached, while here the strength of
the signal is a function of the Hamming distance

• the strength of the signal in [4] does not increase
after reaching 0, while here the strength, as a function
of the Hamming distance d, falls to a minimum and then
increases again heading to d = n.

The function used for the strength of the signal,
equation (1), is a sine wave in which m is the percentage
of the maximum of the lowered signal. In the tests, a
value of m = 0.20 gave good performance. The graph of the
function is visible in figure 2. So instead of increasing
or decreasing a counter by 1, a counter is increased or
decreased by signal(d), where d is the Hamming distance
between the input pattern and the address of the location
of the corresponding vector of counters.

Figure 2: Graph of the signal strength function in the
Signal Decay model.
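
The following sketch illustrates the update rule with a
stand-in for signal(d). The curve used here is an
assumption, chosen only to have the properties stated in
the text (strongest at d = 0, falling to a minimum, then
rising to a fraction m of the maximum at d = n). Placing
the minimum at d = n/2 and using a cosine are hypothetical
choices and should not be read as equation (1).

```python
import math


def signal(d, n, m=0.20):
    """Hypothetical decay curve: 1 at d = 0, about 0 at d = n/2, m at d = n."""
    c = math.cos(math.pi * d / n)
    # Past the midpoint the cosine is negative: flip it and scale it by m.
    return c if c >= 0 else -m * c


def store_signal_decay(addresses, counters, pattern, m=0.20):
    """Update the real-valued counters of every location by +/- signal(d).

    No storage radius is used: each location is written with a strength
    that depends on its Hamming distance d from the input pattern.
    """
    n = len(pattern)
    for addr, vec in zip(addresses, counters):
        d = sum(a != b for a, b in zip(addr, pattern))
        s = signal(d, n, m)
        for j, bit in enumerate(pattern):
            vec[j] += s if bit else -s
```

With this stand-in, a location at the address of the
complemented pattern (d = n) receives the same signs as the
nearest location, only scaled by m.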



As in [4], the SDM is more flexible because there is no
need to select a storage radius. With this Signal Decay
model not only the nearest locations are rewarded with a
stronger signal, but also the farthest ones.

The need to generate far locations as well, as explained
in the previous section, is justified by the use of the
sine wave function described in (1). Without those
locations the SDM could not memorize the information about
the complemented pattern, which needs to be stored far
from the input pattern. For example, using 20% corrupted
patterns of a black 'A' on a white background to train the
SDM allows the SDM to retrieve the pattern both from a
corrupted black 'A' on a white background and from a
corrupted white 'A' on a black background (figure 3).

This training algorithm removes the need to select a
radius for the storage, but still requires a radius for
the retrieval. In fact, the retrieval algorithm is
identical to that of Kanerva's original SDM; the only
difference is that real values are summed instead of
integer values.

Figure 3: Capability of an SDM trained with the Signal
Decay model to recognize patterns corrupted by both 20%
and 80%, even though the training set contained only
patterns corrupted by 20%.
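
A minimal sketch of this retrieval over real-valued
counters is given below. The function name, the inclusive
radius test and the toy counter values in the usage note
are assumptions for illustration; the counters are written
by hand to resemble what a Signal Decay storage of a single
pattern might leave behind.

```python
def retrieve(addresses, counters, cue, radius):
    """Sum the real-valued counters within radius of cue, threshold at 0."""
    active = [vec for addr, vec in zip(addresses, counters)
              if sum(a != b for a, b in zip(addr, cue)) <= radius]
    if not active:
        # No location selected: every sum is 0, so every bit reads 0.
        return [0] * len(cue)
    sums = [sum(column) for column in zip(*active)]
    return [1 if s > 0 else 0 for s in sums]
```

For example, with addresses [[1, 1, 0, 0], [0, 0, 1, 1]] and
counters [[1.0, 1.0, -1.0, -1.0], [0.2, 0.2, -0.2, -0.2]],
the inverted cue [0, 0, 1, 1] with radius 1 selects only the
far location, whose weaker counters still carry the signs
of the stored pattern, so the call returns [1, 1, 0, 0].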



IV Tests 

The patterns chosen for the tests are visible in figure 4. 

In all the performed tests the pattern size is 256 and
the selected radius is 35% of the pattern size, or 89.
Every SDM has been tested with patterns differing in
Q-factor, where the Q-factor is the number of 1's in a
given pattern. The Q-factor values chosen for the tests
are 32, 64, 96 and 128. Every SDM has been trained with a
single pattern, using 5 copies corrupted by 15%, 5
corrupted by 20% and 5 corrupted by 25%.
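
The test setup can be sketched as follows. The helper
names are hypothetical, and placing the 1's at random
positions and rounding the number of flipped bits to the
nearest integer are assumptions; the actual test patterns
are those of figure 4.

```python
import random

PATTERN_SIZE = 256
RADIUS = int(0.35 * PATTERN_SIZE)  # 35% of 256, truncated to 89


def pattern_with_q_factor(q, size, rng):
    """Bit list of the given size containing exactly q ones."""
    ones = set(rng.sample(range(size), q))
    return [1 if i in ones else 0 for i in range(size)]


def corrupt(pattern, fraction, rng):
    """Copy of pattern with round(fraction * n) distinct bits flipped."""
    n_flips = round(fraction * len(pattern))
    flipped = set(rng.sample(range(len(pattern)), n_flips))
    return [bit ^ 1 if i in flipped else bit
            for i, bit in enumerate(pattern)]


def training_set(pattern, rng, levels=(0.15, 0.20, 0.25), copies=5):
    """Five corrupted copies of one pattern at each corruption level."""
    return [corrupt(pattern, level, rng)
            for level in levels
            for _ in range(copies)]
```

With these choices a training set holds 15 patterns, at
Hamming distances 38, 51 and 64 from the clean pattern.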

The tested SDMs are:

• Kanerva's original SDM trained with the original
algorithm with static creation of hard locations

• Signal Decay model with dynamic creation of hard
locations




Figure 4: Patterns used for the tests. Panel labels:
Q-factor = 96, Q-factor = 128.



V Results 

Starting with the tests performed on the straight-trained
SDMs using a retrieval pattern with a percentage of
corruption between 5% and 30%, the new model performs
better than Kanerva's original SDM, as can be seen in
figure 5. The difference in performance is most visible
when the Q-factor is low. Kanerva's SDM cannot even
recognize the pattern whose Q-factor is 128 from an input
corrupted by 30% (figure 6). It is also important to
emphasize that the results presented here refer to the
number of bit-errors of the first self-addressing pattern
recalled during the retrieval. The reason behind this
choice is that during a real retrieval (not a test as in
this case) the real pattern is unknown most of the time,
and it can be convenient to stop the retrieval iteration
when a self-addressing pattern is recalled.

Retrieving with a 5%-10%-15%-20%-25% corrupted pattern
(legend: Kanerva's SDM, Signal Decay model)

Figure 5: Results of the test performed on an SDM using a
retrieval pattern with a percentage of corruption between
5% and 25%. It shows the better performance achieved by
the new model compared with Kanerva's SDM.
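
The stopping rule and the error measure used for these
results can be sketched as follows. Reading a
"self-addressing pattern" as one that recalls itself when
used as the cue is an interpretation, and the function
names and the max_iterations cap are hypothetical.

```python
def bit_errors(recalled, target):
    """Number of bits in which the recalled pattern differs from target."""
    return sum(a != b for a, b in zip(recalled, target))


def iterate_until_self_addressing(retrieve, cue, max_iterations=10):
    """Feed each recalled pattern back as the next cue.

    Stops at the first pattern that recalls itself, or after
    max_iterations retrievals, and returns the last pattern obtained.
    `retrieve` is any function mapping a cue to a recalled pattern.
    """
    current = cue
    for _ in range(max_iterations):
        recalled = retrieve(current)
        if recalled == current:
            return recalled
        current = recalled
    return current
```

In a test the clean pattern is known, so bit_errors can be
applied to the returned pattern; in a real retrieval only
the stopping rule is available.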

Retrieving with a 30% corrupted pattern

Figure 6: Results of the test performed on an SDM using a
retrieval pattern with a percentage of corruption of 30%.

Retrieving with a 35% corrupted pattern

Figure 7: Results of the test performed on an SDM using a
retrieval pattern with a percentage of corruption of 35%.
It shows that Kanerva's SDM is able to retrieve only the
pattern whose Q-factor is 96.

Even in this case, where the retrieval pattern is
corrupted by 35%, Kanerva's SDM is still the one with the
worst performance (figure 7).

Retrieving with a 65% corrupted pattern

Figure 8: Results of the test performed on an SDM using a
retrieval pattern with a percentage of corruption of 65%.
With an input this corrupted, the retrieval ended
successfully only when the Q-factor is 96.



Retrieving with a 70%-75%-80%-85%-90%-95% corrupted pattern
(axis label: Q-factor)

Figure 9: Results of the test performed on an SDM using a
retrieval pattern with a percentage of corruption between
70% and 95%.

It may seem strange that with a Q-factor of 128 the
performance of the SDMs deteriorates a little, but it is
important to remember that the patterns used both for the
training and for the retrieval are randomly chosen, so
that deterioration could be attributed to an unlucky
combination of choices.

Moving on to the remaining results, figures 8 and 9 show
how the Signal Decay model is able to retrieve patterns
corrupted by between 65% and 95% while committing fewer
than 5 bit-errors. Only the Signal Decay model is able to
retrieve a pattern from such a highly corrupted input.

The tests show that the new model always performs better
than Kanerva's SDM. In addition, the Signal Decay model is
able to recognize highly corrupted patterns.



VI Discussion 

The purpose of this article is to review Kanerva's SDM
model, whose charm comes from Kanerva's idea of drawing
inspiration from biology in trying to emulate the
behaviour of the human brain.

As Denning said in [2], there are many ways to enhance
the model of the Sparse Distributed Memory, such as the
capability to recognize patterns independently of
translations, rotations or zooms. I hope that this article
will draw attention to these problems using the model of
the Sparse Distributed Memory.

The Signal Decay model presented in this article
overcomes the limit of Kanerva's SDM of efficiently
handling only random data, and gives the SDM a
characteristic inspired by the human brain, namely the
recognition of inverted patterns. This model is also more
flexible and dynamic thanks to a hard-location creation
process that depends on the input patterns.

The hope is that this article helps to make a further
step towards what in the future will hopefully be able to
replicate human memory and, moreover, the human
recognition capability, using Kanerva's promising theory.